1.0 INTRODUCTION



The Dolphin central processor consists of three or four major
subsystems: the Ebox, which performs the traditional tasks of an
arithmetic central processor; the Ibox, which prefetches and decodes
instructions for the Ebox; the Mbox, which fetches data from main
memory and caches it for the other units; and the optional FPA
(floating point accelerator), which provides high speed arithmetic
processing of floating point operands in parallel with the Ebox. The
Ebox, Ibox and FPA combined are called the "Pbox".



The Dolphin Mbox is composed of four principal functional units: the
microcontroller, the data cache, the page table cache, and the bus
interface.

2.0 TECHNOLOGY 

The Dolphin Mbox will occupy two Extended Hex multilayer etch
modules. It will use ECL 256 x 4 bit memories and Macro cell array 
LSI logic circuitry. There will be approximately 35 Macro cell 
arrays and 162 memories including the Bus interface. (There are 10 
Macro cell array part types including the Bus and ECC parts.) A 
small number of small scale integration parts will be used, 
primarily as buffers and parity checkers. 



3.0 MBOX FUNCTIONS AND FEATURES 

1. 2048 word cache of Pbox instructions and read and write 
data. 

2. Virtual address translation by a 512 entry page table 
cache. 

3. Control and interface to the Dolphin Bus for interrupts and
memory requests. 

4. Error recovery using ECC and special hardware and software 
on all memories and interboard bus signals. 

5. A microcontroller to handle bus operations, cache refills,
cache sweeps, word invalidates, and other operations
traditionally performed by hard wired logic.

6. High performance symmetrical processor support is designed 
in. Multiple processors' caches are automatically kept up 
to date with the optional shared pages data integrity box 
which plugs into the Dolphin bus. 
4.0 PRINCIPLE OF OPERATION

4.1 Overview 

[...] control logic to realize high [...] with a minimum of unchecked
control logic. This is an [evolution] of the philosophy of design of
the KL10 Mbox. In the KL10, [...] paging were controlled by the Ebox
microcode [...]. [...] all paging is controlled by the Ebox [...]
except [...]. Writeback operations and Dolphin bus control
operations are controlled by the Mbox microcontroller.



4.2 General Description 

[...] keyed by physical address. The physical address size has been
expanded from 22 bits to [27] bits [...] increase in the size of the
cache directory from [13] to [18] bits of physical page number. An
autonomous cache [...] invoked by the same instruction as in the KL10
(unless someone objects).

[The page table] is organized as 512 X 1 X 1, increasing the page
table [...] the physical page entry now total [...] bits. The lookups
in these two caches happen simultaneously, as in the KL10. No paging
algorithm is implemented in the Mbox, and all paging cache misses or
access control [...] are reported to the Ebox. As in TOPS-20 paging
on the KL10, the Ebox microcode is used to follow the appropriate
(TOPS-10 or TOPS-20) [...] cache without restarting the instruction.

[...] the principles agreed on at the 16 June [...] memo SHARED2 of
20 July 1978 [...]. The [...] in which interlocks (AOS, etc.) are
handled may be found in these memos.

The Mbox will handle the Dolphin bus related part of the
processor PI system. It will interrupt the Ebox when a PI request
arrives on the bus and it will handle sending out "broadcast polling
messages" and responding to device responses to the poll.



4.3 Detailed Description 

The Mbox will incorporate a fast, pipelined [...] which will honor
requests for major cycles from the Ebox and Ibox, as well as
internally generated requests for cache sweeping and bus-originated
cache zap requests. A major cycle is defined as a flexible period of
time during which the "VMA bus" is driven from a single source. In a
typical operation, the Ebox would request a cycle one tick (16.67 ns)
ahead of when its VMA register will be loaded. The Mbox controller
would issue a grant signal, "MBOX GRANT", which will cause the Ebox
VMA to be gated onto the VMA [bus]. [...] the controller would
proceed. During the [...] of the major cycle, arbitration would take
place to award the next cycle.

Various conditions might interfere with this happy scenario.
If the desired word were not in the cache, the Mbox controller would
retrieve it from main memory. Depending on timing, the Mbox may
always issue a Bus request in speculation that it may be required
and then not use the cycle if not needed. Memory requests will in
general be 4 word requests, as in the KL10. If [...] are not needed
in the cache, the Mbox will [...]. During the time the Mbox is
waiting for memory to respond, it will only accept cycles directed
to it from the [...] acknowledge and [...] responses. All other
cycles will be busy acknowledged. [...] the Mbox controller will
write the appropriate [...] in the cache directory, and the [...].
When this is done, the Ebox response signal will be held up, and the
Mbox controller will just count a timeout.

When the data arrives from the [...] indication that it [was]
accepted by the Ebox (or possibly without such indication), the Mbox
controller will proceed to write the received words into the cache.
The desired word will be held [...] at least until the Ebox asserts
"EBOX RECEIVED DATA". [More to come.]
5.0 DETAILED PBOX INTERFACE DESCRIPTIONS

5.1 Pbox Virtual References; Reads And Writes: 

The following are the signals passed between the Ebox/Ibox/FPA
and the Mbox for Virtual (paged) read and write references. The
Ebox is the only device of the three that does any operation other
than memory reads.

* Note: physical references use the same interface signals as
the virtual reference signals with the exception of the items marked
below with an asterisk.

1. * Ebox - "PHYS REF" (physical reference) false. 

2. * Pbox - "VMA BUS" as per figure 5.1. These enable 
checking of sub parts of the VMA BUS. 

3. Pbox - VMA bus parity - several parity bits over parts of 
the field. 

4. Pbox - "READ" - Asserted on or before VMA and Mbox request. 

5. Ebox - "WRITE" - Asserted on or before VMA and Mbox 
request. 

6. * Ebox - "WRITE TEST" - Mbox will test legality of a write 
before doing any function. This signal is guaranteed to 
only be asserted on virtual references. The Mbox will do a 
write test if and only if this signal is asserted. 

7. Pbox - "MBOX REQUEST" - This will be asserted not more than 
one Mbox clock tick before the VMA is valid. Request type 
signals such as "PHYS REF", "READ", "INTERLOCK", or "WRITE 
TEST" will be asserted on or before Mbox request time. 

Tentatively, the Mbox may begin processing a request 
before the previous request has been accepted by the Pbox 
by asserting "EBOX RECEIVED DATA". The Mbox output data 
will not be affected by this. 

8. Ebox - "EBOX DATA" - 36 bits of data plus ECC check bits, 
43 bits total. It must be valid at most one Mbox tick 
after "MBOX REQUEST". 

9. Pbox - "LOOK" - Look in the cache for this reference. (If 
a virtual reference, the Mbox ands "LOOK" with PT 
cacheable.) 

10. Pbox - "LOAD" - Load new data into the cache. Look must be 
true. (If a virtual reference, the Mbox ands "LOOK" and 
"LOAD" with PT cacheable.) 
11. [...] Mbox ands it with (PT cacheable or PT writethrough) [...]

    1. "LOOK" and "LOAD" = [...] or

    2. "PT CACHE" = [...] or

    3. "PT WRITEBACK" = 1.

12. [...]

13. [...]

14. Pbox - "PBOX BACKGROUND READ" - Similar to Pbox read except
that it must be followed up by a regular request before the
Mbox will [assert] "MBOX RESPONSE". Also, the Mbox will abort
the servicing of the request if any other request comes along,
and the Mbox will never request main memory solely on the
basis of a "PBOX BACKGROUND READ".

If a "PBOX BACKGROUND READ" was seen by the Mbox three [...]
before a regular read request and the Mbox had not aborted it,
and the page table cache entry was [...] in the cache, then on
the clock tick after the regular request, "MBOX RESPONSE" will
be asserted and the data will be available on the "MBOX OUT"
lines.

This provides a means for the Pbox to maximize utilization of
the Mbox without some of the serious contention problems that
would otherwise result.

15. Ebox - "[...]" - Asserted by Ebox when the data being
written is available at the Ebox Data Out lines.

16. Pbox - "ABORT MBOX CYCLE" (As in KL10 AC Ref) This may be
asserted at almost any time during any cycle. Exception
cases will be specified later. It will also be used by the
Ebox to clear a page fail state.

17. Mbox - "MBOX GRANT" - There is one Mbox Grant signal line
[for each] device which may access the VMA BUS. For example,
"MBOX GRANT IBOX". It enables the highest priority
requestor of the Mbox to access the VMA bus. Bus data is
read onto the VMA bus at the start of the next Mbox clock.

18. Mbox - "MBOX DATA" - 36 bits of data plus ECC check bits,
43 bits total. Valid when "MBOX RESPONSE" true.
19. Mbox - "MBOX RESPONSE" - On write: Mbox received write
data. On read: Data available at output latch.

20. Mbox - "PAGE FAIL" - This signal is the catch all for most
exception conditions including page table access failures,
processor interrupts, ECC errors and NXM traps. It will
stay asserted until the Ebox asserts "ABORT MBOX CYCLE".
It is enabled by "MBOX REQUEST", so the Ebox must do an
"ABORT MBOX CYCLE" before its first Mbox request after the
page fail or else the Ebox will page fail again. The
following conditions result in page fail:

    1. * Page table cache no match.

    2. * Page table write access failure.

    3. * Page table Written state transition.

    4. Processor automatic priority interrupt. As in the
    KS10, instead of having the microcode check for
    interrupts, the Mbox will page fail the microcode when
    next it does an Mbox request.

    5. ECC or parity error.

    6. NXM (Non-existent memory) error.

    7. Incomplete memory cycle.

21. Mbox - Five or six bit problem type code that indicates the
type of failure for the above sources of page failure.
This problem type code will be used by the Ebox microcode
to dispatch to the appropriate service routine. References
which receive a page fail will not get an Mbox response.
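
The "PAGE FAIL" latch and the problem type dispatch of items 20 and
21 can be sketched roughly as below. This is only an illustration:
the numeric code values, the class and function names, and the rule
that the first recorded cause is kept are all assumptions, since the
specification only says the code is five or six bits wide.

```python
from enum import IntEnum


class PageFailCause(IntEnum):
    # Hypothetical code values: the specification does not assign them.
    PT_NO_MATCH = 1
    PT_WRITE_ACCESS_FAILURE = 2
    PT_WRITTEN_TRANSITION = 3
    PRIORITY_INTERRUPT = 4
    ECC_OR_PARITY_ERROR = 5
    NXM = 6
    INCOMPLETE_MEMORY_CYCLE = 7


class PageFailLatch:
    """Models "PAGE FAIL": it stays set until "ABORT MBOX CYCLE"."""

    def __init__(self):
        self.cause = None

    def record(self, cause):
        # Assumption: the first cause is kept until the Ebox clears it.
        if self.cause is None:
            self.cause = PageFailCause(cause)

    def mbox_request(self):
        """Return the pending cause, or None if the request may proceed.

        "PAGE FAIL" is enabled by "MBOX REQUEST", so a request issued
        before the abort page fails again with the same cause.
        """
        return self.cause

    def abort_mbox_cycle(self):
        # The Ebox clears the page fail state.
        self.cause = None


def service_routine(cause):
    # Hypothetical routine names derived from the enum member names.
    return "service_" + PageFailCause(cause).name.lower()
```

In this model the Ebox microcode would call abort_mbox_cycle() before
its next request, which mirrors the ordering rule stated in item 20.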

Figure 5.1

VMA bus Virtual Reference Format

 0     3 4 5 6         17 18      26 27      35
!-------!-!-!------------!----------!----------!
! Must  ! ! !  Virtual   ! Virtual  ! Virtual  !
!  be   !0!U!  Section   !   Page   !   Line   !
! zero  ! ! !            !          !          !
!-------!-!-!------------!----------!----------!

Note: on a virtual reference, the Mbox will generate a zero byte
mask. Also, the Mbox must provide a write test function for the
Ebox/Ibox.
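
As an illustration of the virtual reference format, the sketch below
splits a 36-bit VMA word into its fields. It assumes the usual PDP-10
convention that bit 0 is the most significant bit, reads Figure 5.1
as bits 0-3 zero, bit 5 "U", bits 6-17 section, bits 18-26 page and
bits 27-35 line, and takes "U" to mean user mode. The function names
are hypothetical.

```python
WORD_BITS = 36


def field(word, first, last):
    """Extract bits first..last of a 36-bit word (bit 0 = most significant)."""
    width = last - first + 1
    return (word >> (WORD_BITS - 1 - last)) & ((1 << width) - 1)


def decode_virtual_vma(vma):
    """Split a virtual reference on the VMA bus into its fields."""
    if not 0 <= vma < (1 << WORD_BITS):
        raise ValueError("VMA must fit in 36 bits")
    if field(vma, 0, 3) != 0:
        raise ValueError("bits 0-3 must be zero")
    return {
        "user": field(vma, 5, 5),       # assumed meaning of "U"
        "section": field(vma, 6, 17),   # 12 bits
        "page": field(vma, 18, 26),     # 9 bits: indexes the 512 entry page table
        "line": field(vma, 27, 35),     # 9 bits: word within the page
    }
```

The 9-bit page field matches the 512 entry page table cache, which is
addressed by bits 18 to 26 of the VMA bus.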
5.2 Ebox Physical References; Reads/writes:

* Note: physical references use the same interface signals as
the virtual reference signals with the exception of the items marked
above with an asterisk.

All others above plus the following apply. 

1. Ebox - "Physical Reference" true 

2. Ebox - VMA bus as per figure 5.2 below. 

Figure 5.2
VMA bus Physical Reference Format

 0     3 4    7 8 9       17 18      26 27      35
!-------!------!-!--------------------------------!
! Must  !      ! !                                !
!  be   ! Mask !/!        Physical Address        !
! zero  !      !0!                                !
!-------!------!-!--------------------------------!



5.3 Ebox Page Table And PT Directory Read Special Function. 

Page table read supplies the page table address on bits 18 to 
26 of the VMA BUS. Both the page table and the page table directory 
data are returned on the MBOX DATA lines in the format specified by
figure 5.3 below. 

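Figure 5.3
Page Table Cache Read/Write Format

 0      4 5 6   8 9                   26 27         35
!--------!-!-----!----------------------!-------------!
! Access !U! PT  ! Physical page number !    Page     !
! state  !S! Dir !                      !    table    !
!  bits  !E! 6-8 !         9-26         !  directory  !
!        !R!     !                      !    9-17     !
!--------!-!-----!----------------------!-------------!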
5.4 Ebox Page Table And PT Directory Write Special Function.

The Ebox supplies the page table address on bits 18 to 26 of the VMA
BUS. The data for the [page table and page table] directory entry is
supplied on the EBOX DATA OUT lines with the format of figure 5.3
above.



5.5 Sweep Functions 



[...] of a signal "SWEEP REFERENCE" [...] from group 1
below and one or both from group 2 below.



Group 1 - 

1. Sweep one word 

2. Sweep one four word block 

3. Sweep one page 

4. Sweep all pages 

Group 2 - 

1. Invalidate cache 

2. Validate memory 
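
The two groups of sweep functions can be modelled as one scope plus a
set of actions, as in the sketch below. The names, the tuple returned,
and the rule that exactly one group 1 function is chosen are
assumptions made for illustration; the specification does not give
the signal encoding.

```python
from enum import Enum, Flag, auto


class SweepScope(Enum):
    # Group 1: how much of the cache the sweep covers.
    ONE_WORD = auto()
    FOUR_WORD_BLOCK = auto()
    ONE_PAGE = auto()
    ALL_PAGES = auto()


class SweepAction(Flag):
    # Group 2: one or both may be requested.
    INVALIDATE_CACHE = auto()
    VALIDATE_MEMORY = auto()


def make_sweep_request(scope, action):
    """Check and return a (scope, action) pair for a "SWEEP REFERENCE"."""
    if not isinstance(scope, SweepScope):
        raise TypeError("scope must be one group 1 function")
    if not isinstance(action, SweepAction) or not action:
        raise ValueError("at least one group 2 function is required")
    return scope, action
```

A sweep that both writes back and invalidates a page would then be
requested as make_sweep_request(SweepScope.ONE_PAGE,
SweepAction.INVALIDATE_CACHE | SweepAction.VALIDATE_MEMORY).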



5.6 PI System 

NOTE

The PI system design is evolving. Therefore, the following [is
subject to] change.

The PI system in the Dolphin is [...] the nature of the unified bus
design. Because of [...] the Mbox, which is the interface to the
Dolphin bus, [...] all control [...] and the Ebox's [...] functions
which were previously handled by the KL10 [Ebus]. [...] also allow
the Ebox to get a bus cycle [...] generate a [...] comparing the ID
of an interrupting device to a saved ID field.

When an interrupt or other message for the CPU is received on 
the bus, the Mbox will page fail the Ebox. [More detail later on 

this.] 

The Ebox will be able to read the PI level (on bits 33-35 of
"MBOX DATA").



6.0 DETAILED BUS INTERFACE DESCRIPTION 

The Mbox will use the standard 11 MCA Dolphin bus interface 
chip set. The Mbox data path will provide the receive buffer space. 
The Mbox microcontroller will initiate all bus operations and handle 
all incoming bus requests and data returns. For more information, 
see the Dolphin Bus Specification. 



7.0 DETAILED CACHE FUNCTIONAL DESCRIPTION 

To be supplied. See KL10 hardware maintenance manual. The
cache will use the same structure and algorithms as the KL10 cache.
State control will be provided by the Mbox microcontroller ([...]
chip). MCAs involved in the cache control are the CSH, VAW and MAD
chips plus the microcontroller.



7.1 Cache Sweeps 

Cache sweeps will be done by the Mbox microcontroller. They will be
significantly faster than the KL10 and will continue to be able to
run asynchronously to and in parallel with the Pbox. [See interface
sections for more information.]



7.2 Cache Writeback/refill 

Cache writeback/refill will be controlled by the Mbox 
microcontroller. 

7.3 Cache Invalidate Word/block Per Bus Request. (Cache Zap). 

Cache invalidate functions will be [performed by the Mbox]
microcontroller, which will handle commands from the bus interface
and service them by clearing valid and written bits in one to [...]
words of cache directory.
8.0 DETAILED PAGE TABLE FUNCTIONAL DESCRIPTION

All of the required state information for access of a page will
be encoded into three bits. This includes the states involving the
access, writeable, modified, cacheable and writeback conditions in
the PT directory. The three bits give eight states in total, as
follows:

1. Page Not in hardware PT (not valid) 

2. Not cacheable, not writeable 

3. Not cacheable, writeable, not written 

4. Not cacheable, writeable, written 

5. Cacheable, not writeable 

6. Cacheable, writeable, not written 

7. Cacheable, writeable, written 

8. Writethrough (= Cacheable, writeable, written and
Writethrough)

A fourth bit which is independent of the above states, would 
indicate that a CST update is needed on next refill of the page 
table. 
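
One way to picture the eight access states is as a three bit
enumeration with helper predicates, as sketched below. The numeric
values simply follow the order of the list (position minus one) and
are an assumption, as is the mapping of a write attempt onto the page
fail conditions of section 5.1; the specification does not define the
bit assignment.

```python
from enum import IntEnum


class PageState(IntEnum):
    # Hypothetical encoding: list position minus one.
    NOT_VALID = 0
    NOCACHE_READONLY = 1
    NOCACHE_WRITEABLE_CLEAN = 2
    NOCACHE_WRITEABLE_WRITTEN = 3
    CACHE_READONLY = 4
    CACHE_WRITEABLE_CLEAN = 5
    CACHE_WRITEABLE_WRITTEN = 6
    WRITETHROUGH = 7


WRITEABLE = {
    PageState.NOCACHE_WRITEABLE_CLEAN,
    PageState.NOCACHE_WRITEABLE_WRITTEN,
    PageState.CACHE_WRITEABLE_CLEAN,
    PageState.CACHE_WRITEABLE_WRITTEN,
    PageState.WRITETHROUGH,
}
WRITTEN = {
    PageState.NOCACHE_WRITEABLE_WRITTEN,
    PageState.CACHE_WRITEABLE_WRITTEN,
    PageState.WRITETHROUGH,
}


def is_cacheable(state):
    return PageState(state) >= PageState.CACHE_READONLY


def write_check(state):
    """Return the page fail reason for a write, or None if it may proceed."""
    state = PageState(state)
    if state is PageState.NOT_VALID:
        return "page table cache no match"
    if state not in WRITEABLE:
        return "page table write access failure"
    if state not in WRITTEN:
        # Assumed: the first write to a clean page is reported to the Ebox.
        return "page table written state transition"
    return None
```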

[More Details to be supplied.] 



9.0 FUNCTIONS NOT PERFORMED BY THE MBOX 

9.1 All Handling Of The UBR And EBR. 

The UBR and EBR are kept in a scratch pad by the Ebox. It is 
solely responsible for their maintenance. 



9.2 The Map Instruction. 

The Map instruction will be implemented in the microcode by 
tracing through page tables in memory. There is no Mbox assistance 
and therefore, it will be slower than in the KL10. 
9.3 Sbus Diag Functions

[The] functions currently equivalent to "Sbus Diag" become
Dolphin bus I/O reads and writes.



10.0 PERFORMANCE 

10.1 Overall Goal And Commitment

The performance of the Mbox in terms of overall system
throughput is almost solely a function of the cache access time.
(The second order effect is main memory access time and a third
[...].) The goal for the Dolphin [Mbox is a cache] access time of 50
nanoseconds. This yields a most likely logic mix performance of 2.1
times a 2050 at 93% cache hit rate.
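
To show why cache access time dominates and main memory access time
is only a second order effect, the sketch below computes a simple
weighted mean access time. The 50 ns cache time and 93% hit rate come
from the text; the 500 ns figure for a reference that misses is a
purely hypothetical placeholder, and the single-level model itself is
an assumption.

```python
def mean_access_ns(hit_rate, cache_ns=50.0, miss_ns=500.0):
    """Weighted mean time per reference.

    miss_ns is the total time of a reference that misses the cache.
    Its default value is hypothetical, not taken from the specification.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ns + (1.0 - hit_rate) * miss_ns
```

With these placeholder numbers, hits contribute 0.93 x 50 ns and
misses contribute 0.07 x 500 ns to the mean, so even a tenfold slower
memory path adds less than the cache time itself.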



10.2 Page Table 



The page table hit rate will be somewhat better than the KL10
for an equivalent software environment in that there are 512
directory entries versus 128 in the KL10. This will somewhat
counteract the effects of the increased number of sections used by
software.



10.3 Section Pointer Cache 

The section pointer cache is provided and controlled by the Ebox
microcode using the Ebox's scratch RAM. [See Ebox functional
specification.]
11.0 R.A.M.P. FEATURES

11.1 Single Error Correction ("ECC")

No single RAM error will be able to crash the system.

Almost all Random access memories, address and data paths carry 
ECC. The exceptions are the VMA bus, which will carry parity, the 
use bits and history bits, which will have parity and which can not 
cause fatal errors to the system, only lower cache hit rates, and 
the Valid and Written bits. The Valid and Written bits will have 
multiple error correction provided that the operating system can 
tolerate the writing back to memory from the cache of a word that 
had not been written by the processor since it was last read into 
the cache. If the operating system can not tolerate the writing
back of a non-written word, a written bit single error will be 
handled by the monitor similarly to current parity errors - recover, 
crash user, or crash system depending on the situation. In almost 
all cases, the error will be non-fatal. Unchecked random control 
logic will be held to the barest minimum amount possible. [See the
Dolphin RAMP plan for more information.]



11.2 Error Correction Method 

Due to the need to use data before the error can be detected, 
the correction method will be to retry the reference with slower 
clocking to allow time for the syndrome correction to propagate 
through the longer path. 
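
The specification does not give the ECC code used. As a generic
illustration of how seven check bits over a 36-bit word (43 bits
total, as on the "EBOX DATA" and "MBOX DATA" lines) produce a
syndrome that locates a single error, the sketch below uses a
textbook extended Hamming code. The bit layout and function names are
assumptions, and the slower-clock retry itself is not modelled.

```python
PARITY_POS = [1, 2, 4, 8, 16, 32]
DATA_POS = [p for p in range(1, 43) if p not in PARITY_POS]  # 36 positions


def encode(data):
    """Return 43 code bits for a 36-bit word (index 0 = overall parity)."""
    bits = [0] * 43
    for i, pos in enumerate(DATA_POS):
        bits[pos] = (data >> i) & 1
    for p in PARITY_POS:
        # Each check bit makes the parity of its covered positions even.
        bits[p] = sum(bits[q] for q in range(1, 43) if q & p) % 2
    bits[0] = sum(bits) % 2
    return bits


def decode(bits):
    """Return (data, status): status is 'ok', 'corrected' or 'uncorrectable'."""
    bits = list(bits)
    syndrome = 0
    for p in PARITY_POS:
        if sum(bits[q] for q in range(1, 43) if q & p) % 2:
            syndrome |= p
    overall = sum(bits) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1 and syndrome <= 42:
        # Single error: the syndrome is the failing position
        # (0 means the overall parity bit itself).
        bits[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of errors, or a syndrome outside the word.
        return None, "uncorrectable"
    data = sum(bits[pos] << i for i, pos in enumerate(DATA_POS))
    return data, status
```

Because the syndrome is only available after the data has already
been forwarded, a design like this one needs the retry path described
above to deliver the corrected word.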



11.3 Error Logging 

All correctable and uncorrectable errors will be reported to 
the Ebox microcode by means of a page fail.

Uncorrectable errors may temporarily stop the Mbox clock after 
latching information on the failure and interrupting the maintenance 
console. (This would not affect the Memory clocking.) The console 
will poll the diagnostic logic to find out the failure type and 
gather all possible state data and then will continue or restart the 
Mbox. Error state information will be latched in registers where 
applicable. [More to come.] 

I propose that the Ebox microcode have a "logout" area in main 
memory where it may put the information for the use of the monitor. 
The Mbox may have functions which allow the microcode to disable the 
interrupt for correctable errors. (Or the Ebox may be told by the 
operating system to ignore them.) This will reduce the amount of 
repetitive information in the system error file and reduce error 
handling overhead. 
The Mbox microcode may implement error retry for selected fault
conditions (other than correctable ECC errors).



12.0 COST 

To be supplied - See Vic Ku's system cost breakdown. 

13.0 EXPANDABILITY 

13.1 Cache 

The cache may be expandable from the minimum 2K to 4K or 8K 
words if there is significant performance improvement with reasonable
differential performance change. Present calculations indicate a 
logic performance improvement of less than 5% for a 2.5% increase in 
manufacturing cost. This is based on doubling the cache size using 
the same RAMs. (There are two problems with cache expansion - 
either the page table must be cycled in series with the cache to 
provide more address bits for the cache (this may require a faster 
256 bit memory), or a greater level of associativity must be provided
which makes the implementation of a Least Recently Used refill 
algorithm much more difficult. Use of simpler algorithms such as 
pseudo-random has not been investigated.) 



13.2 Physical Address Space 

No effort will be made to make the physical address space 
expandable.



13.3 Multiple Processors 

The goal is to allow configuration of up to four processors 
under both TOPS-10 and TOPS-20. 

A scheme to allow caching of shared pages by multiple
processors will be implemented. The scheme will provide automatic 
clearing of invalid data from the caches of multiple processors. 
For more detailed information on this scheme, see Alan Kotok's memo 
of 20 July 1978, "Shared Pages in Multiprocessor Systems"
(SHARED2.MEM).






